ABSTRACT 

The New York City Clinical Data Research Network 
(NYC-CDRN), funded by the Patient-Centered Outcomes 
Research Institute (PCORI), brings together 22 
organizations including seven independent health 
systems to enable patient-centered clinical research, 
support a national network, and facilitate learning 
healthcare systems. The NYC-CDRN includes a robust, 
collaborative governance and organizational 
infrastructure, which takes advantage of its participants' 
experience, expertise, and history of collaboration. The 
technical design will employ an information model to 
document and manage the collection and transformation 
of clinical data, local institutional staging areas to 
transform and validate data, a centralized data 
processing facility to aggregate and share data, and 
common standards and tools. We strive to ensure that 
our project is patient-centered; nurtures collaboration 
among all stakeholders; develops scalable solutions 
facilitating growth and connections; chooses simple, 
elegant solutions wherever possible; and explores ways 
to streamline the administrative and regulatory approval 
process across sites. 



INTRODUCTION 

New York City is home to one of the largest, most 
diverse urban populations in the USA, including 
more than 8 million people with a wide range of 
socioeconomic and health characteristics. 1 Its 
healthcare system is marked by a concentration of 
academic medical centers with expertise in clinical 
care, research, and education. Despite this wealth 
of resources, healthcare delivery remains fragmented, 
as patients often receive care from multiple 
institutions, complicating efforts to conduct 
research, manage population health, and develop 
learning healthcare systems. 

Funded by the Patient-Centered Outcomes 
Research Institute (PCORI), the New York City 
Clinical Data Research Network (NYC-CDRN) was 
formed to create an accessible, sustainable, scalable 
clinical data network that will enable patient-centered 
research, support a national research network, and 
facilitate the development of learning healthcare 
systems. This project features a unique collaboration 
across 22 organizations, including seven independent 
health systems, which will create unprecedented 
opportunities for city- and nation-wide population 
health management, patient-centered clinical trials, 
observational studies, and precision medicine. Specific 
goals include aggregating data on a minimum of 1 
million patients, engaging patients and front-line 
clinicians in all phases of the project, embedding research 
activity into the delivery of healthcare, aligning 
regulatory oversight across multiple health systems, and 
disseminating study results across healthcare systems. 

This paper describes the project's goals, governance 
and organizational structure, and technical approach. 

ORGANIZATIONAL AND SCIENTIFIC APPROACH 

The NYC-CDRN includes a robust and collaborative 
governance and organizational infrastructure, 
which takes advantage of its participants' experience, 
expertise, and history of collaboration. 

Participating institutions 

The NYC-CDRN's participating institutions (table 1) 
have several notable features that provide an important 
foundation for the consortium. First, the NYC-CDRN 
includes six Clinical and Translational Science Award 
(CTSA) centers, 2 which already collaborate on 
research, data sharing, and patient engagement. 
Second, the participating health systems (including 
five medical schools, four affiliated health systems, 
and one practice-based research network of federally 
qualified health centers) have robust electronic 
health records (EHRs) and clinical data warehouses 
with many years of data. Third, the New York 
Genome Center (NYGC), an independent non-profit 
entity with which all health systems are affiliated, has 
important expertise in genomic data and acts as a 
neutral party and 'honest broker' 3 for aggregating 
and hosting data from competing institutions for 
research purposes. In addition, two regional health 
information organizations, Healthix 1 and the Bronx 
RHIO's Bronx Regional Informatics Center (BRIC), 
provide important expertise in patient matching and 
record de-duplication. Cornell NYC Tech, a new 
graduate school emphasizing technology and 
entrepreneurship, provides access to new methods for 
collecting patient-generated data. The Biomedical 
Research Alliance of New York (BRANY) provides 
the centralized institutional review board (IRB) 
process, ensuring appropriate regulatory oversight 
and protocol review. Finally, several patients and 
patient advocacy groups provide important expertise 
in patient engagement. 
Table 1 NYC-CDRN participating institutions 

| Partner | Organization | Current platform | Patients in EHR/HIE* |
|---|---|---|---|
| Health system | Clinical Directors Network (CDN) | eClinicalWorks, GE Centricity | 250k |
| | Columbia University College of Physicians and Surgeons (CUCPS)† | Allscripts Enterprise | 767k |
| | Montefiore Medical Center and Albert Einstein College of Medicine (MMC)† | GE Centricity‡ | 1000k |
| | Mount Sinai Health System and Icahn School of Medicine (MSHS)† | Epic | 4700k |
| | New York-Presbyterian Hospital (NYPH) | Allscripts Sunrise | 1400k |
| | New York University Langone Medical Center and New York University School of Medicine (NYULMC)† | Epic | 1800k |
| | Weill Cornell Medical College (WCMC)† | Epic | 560k |
| Research infrastructure | Biomedical Research Alliance of New York; Cornell NYC Tech Campus; New York Genome Center; Rockefeller University† | N/A | N/A |
| HIE | Bronx RHIO (Bronx Regional Informatics Center) | Optum | 1650k |
| | Healthix | InterSystems HealthShare | 7000k |
| Patient organization | American Diabetes Association; Center for Medical Consumers; Consumer Reports; Cystic Fibrosis Foundation; New York Academy of Medicine; NYS Department of Health | N/A | N/A |

*Patients overlap and are for the period 1 August 2008-31 July 2013. 
†Denotes CTSA site. 
‡Montefiore is replacing existing EHR platforms with Epic. 
CTSA, Clinical and Translational Science Award; EHR, electronic health record; HIE, health information exchange; N/A, not applicable; NYC-CDRN, New York City Clinical Data Research Network. 



Organizational structure 

The NYC-CDRN has created a multi-stakeholder organizational 
structure (figure 1) that includes leadership and participation 
from researchers, clinicians, and patients. 4-6 We have organized 
our work according to seven overarching goals. One committee 
leads each goal and liaises with the other committees to 
collaborate on cross-cutting issues. For example, the Technical 
Committee cannot develop its data model without input from 
researchers, clinicians, and patients in the Comparative 
Effectiveness Research (CER) and Patient and Clinician 
Engagement Committees. 

Figure 1 Organizational structure of NYC-CDRN (New York City 
Clinical Data Research Network). 

1. Create a strong governance and business infrastructure: The 
NYC-CDRN has a robust, collaborative governance and 
organizational model that operates the network in the interests 
of all participants. The Governance Board oversees the 
entire project, sets policies in consultation with stakeholders 
and advisors, and ensures that all committees and stakeholders 
are on track to meet their deliverables. It addresses 
open issues within and among the committees, ensures 
common understanding of key network concepts and 
functions, and facilitates interactions with the healthcare 
systems, among other functions. 

2. Ensure strong accountability and coordination among project 
committees and stakeholders: The NYC-CDRN project is a 
complex endeavor with many moving, intersecting, and 
inter-dependent parts. The Operations Group has established 
a project management infrastructure to guide that activity. It 
drives, monitors, and reports progress; ensures quality and 
accountability across all stakeholders; and tracks adherence 
to milestones and timelines. 

3. Develop an overarching vision and sustainability: The 
NYC-CDRN reviews its strategy and vision with an Advisory 
Council of external healthcare leaders and subject matter 
experts. The Council ensures that the project benefits from 
new ideas, stays aligned with local and national developments, 
and focuses on financial sustainability. 

4. Establish a legal foundation that protects patient privacy and 
security: All participating health systems have data sharing 
policies, IRB processes, and privacy and security policies in 
place. However, researchers interested in multi-site studies 
would face a slow process if they had to obtain approvals 
and negotiate separate policies and requirements individually 
with every IRB. Thus, the project's Privacy and Security 
Group works with participants to agree to a common, consistent 
set of network processes, policies, and data sharing 
agreements. The participants have agreed to use a central 
IRB, housed at BRANY. 

5. Engage patients and clinicians: This project relies on strong 
leadership and input from patients and clinicians in all its 
phases. Patients and clinicians participate in governance, 
inform and develop research questions, and ensure that the 
network's policies protect patient privacy and security. The 
Patient and Clinician Engagement Committee ensures that 
all the other committees identify key policies and processes 
needing patient and clinician input. It also focuses on 
the collection of patient-reported outcomes. 

6. Embed research into practice: Participating institutions all have 
expertise and experience in embedding aspects of research into 
practice while minimizing disruption of healthcare delivery: 
identifying patients for research, implementing research protocols, 
monitoring activities, and disseminating research outcomes 
to improve practice. The CER Committee develops use 
cases for the network and ensures that the network facilitates 
different types of research designs, including retrospective 
studies, observational studies, and randomized clinical trials at 
the level of the individual and cohort. Community workgroups 
are being established to identify the best ways to engage 
patients in those communities and to inform research. 

7. Build the technical infrastructure of the research data 
network: In their initial 18 months, all CDRN projects must 
aggregate comprehensive, longitudinal data for at least 
1 million patients for research purposes. Given the number 
of institutions involved, it is a significant challenge to 
compile those data in a standard way, match and link patient 
identities across institutions, de-identify the records, and 
make quality data available. The project's Technical 
Committee oversees the design of the network architecture, 
the data model, and the design for the NYC-CDRN 
Informatics Center. These activities are described in more 
detail below. 

Table 2 New York City population characteristics 

| Characteristic | % |
|---|---|
| Age (years)* | |
| <19 | 24 |
| 20-44 | 39 |
| 45-64 | 24 |
| 65+ | 12 |
| Median age (years) | 36 |
| Race* | |
| White | 44 |
| Black | 26 |
| Hispanic/Latino* | 28 |
| Female* | 53 |
| % Household income <$25k† | 28 |
| % Publicly insured† | 37 |
| % Self-reported diabetes‡ | 11 |
| % Self-reported high cholesterol‡ | 31 |
| % Self-reported current cholesterol meds‡ | 37 |
| % Self-reported high blood pressure‡ | 29 |
| % Self-reported asthma‡ | 12 |
| % Overweight and/or obese‡ | 58 |
| % Receiving mental health medication | 4 |
| % Current smoker | 15 |

*2010 Census. 
†2009 American Community Survey. 
‡2011 NYC Community Health Survey. 



Patient population and selected cohorts 

PCORI CDRN awardees must focus on three conditions: a 
common condition, a rare condition, and obesity. NYC-CDRN 
has selected diabetes as its common condition and cystic fibrosis 
as its rare condition. According to official city data, nearly 
60% of New Yorkers are either overweight (34%) or obese 
(24%), and 11% have diabetes (table 2). Cystic fibrosis is a 
genetic disease that affects the digestive and respiratory systems, 
and NYC-CDRN has identified over 5000 patients with the 
condition among its institutions. 

TECHNICAL APPROACH 

The NYC-CDRN's technical approach will employ an information 
model to document and manage the collection and transformation 
of clinical data, local institutional staging areas to 
transform and validate data, a centralized data processing facility 
to aggregate and share data, and common standards and 
tools. 

The NYC-CDRN Informatics Center, hosted at NYGC, will 
aggregate data from all the health systems centrally and make it 
available for research queries (figure 2). NYC-CDRN is being 
designed so that it is not constrained to a single technology or 
platform. It will utilize agile design and development with 
testing and iterative refinements as well as extensive quality 
controls. 

Information model 

The NYC-CDRN will use a centrally defined information model 
and standardized set of terminologies to form the basis of data 
integration across institutions and for interoperability with 
PCORnet. Health systems will extract data from their EHRs or 
clinical data warehouse platforms according to a common set of 
vocabularies and then leverage existing models, such as that of the 
Observational Medical Outcomes Partnership (OMOP), to validate 
mappings between standard vocabularies and to integrate demographics, 
ethnicity and race, diagnoses, procedures, medications, 
laboratory results, and other clinical elements. 7 By separating 
representation of concepts from data storage implementation, 
the information model will enable use of different technologies 
for distinct purposes. 
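
The separation of concept representation from storage can be illustrated with a minimal Python sketch. It is not the NYC-CDRN information model: the site name, the local codes, the mapping table, and the `Concept` and `Observation` types are all hypothetical, and a real deployment would draw its mappings from curated vocabulary tables rather than a hard-coded dictionary.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical local-to-standard mapping for one site (illustrative codes only).
LOCAL_TO_STANDARD = {
    ("siteA", "DX", "250.00"): ("ICD-9-CM", "250.00"),
    ("siteA", "LAB", "GLU"): ("LOINC", "2345-7"),
}


@dataclass(frozen=True)
class Concept:
    """A clinical concept named by vocabulary and code, independent of storage."""
    vocabulary: str
    code: str


@dataclass(frozen=True)
class Observation:
    """One standardized clinical fact about a patient."""
    patient_id: str
    domain: str
    concept: Concept


def to_standard(site: str, domain: str, local_code: str,
                patient_id: str) -> Optional[Observation]:
    """Translate a site-local code into a standard concept.

    Returns None for unmapped codes so they can be reviewed rather than guessed.
    """
    key = (site, domain, local_code)
    if key not in LOCAL_TO_STANDARD:
        return None
    vocabulary, code = LOCAL_TO_STANDARD[key]
    return Observation(patient_id, domain, Concept(vocabulary, code))
```

Because downstream code sees only `Concept` values, the same observations could in principle be written to a relational warehouse, a document store, or flat files without changing the mapping logic.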

Figure 2 NYC-CDRN (New York City Clinical Data Research 
Network) data flows. 

Local staging areas 

Health systems will host a local staging area for their data feeds. 
They will follow procedures defined by NYC-CDRN for 
standards-based mapping of health information, quality assurance, 
data cleaning, and validation prior to sending a limited 
dataset to the Informatics Center. Systems will contribute 
data to the Informatics Center iteratively. For example, the first 
deliverable is for institutions to contribute patient demographics 8 in a 
defined format, followed by patient encounter data and then clinical 
observation data such as diagnoses, procedures, medications, 
and laboratory results as defined in the central information 
model. 
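
As a rough illustration of the validation step a staging area might run before sending demographics, the following Python sketch checks each record against an assumed format. The field names, the allowed sex codes, and the ISO 8601 date requirement are assumptions made for the example, not the format defined by NYC-CDRN.

```python
from datetime import date

# Assumed demographics format; the real NYC-CDRN format is not given here.
REQUIRED_FIELDS = ("patient_id", "birth_date", "sex")
ALLOWED_SEX = {"F", "M", "U"}


def validate_demographics(record, today=None):
    """Return a list of problems; an empty list means the record passes."""
    today = today or date.today()
    problems = [f"missing {f}" for f in REQUIRED_FIELDS if not record.get(f)]

    sex = record.get("sex")
    if sex and sex not in ALLOWED_SEX:
        problems.append(f"unknown sex code {sex!r}")

    birth = record.get("birth_date")
    if birth:
        try:
            if date.fromisoformat(birth) > today:
                problems.append("birth_date in the future")
        except ValueError:
            problems.append(f"birth_date not ISO 8601: {birth!r}")
    return problems


def split_batch(records, today=None):
    """Separate records that pass validation from those held back for cleaning."""
    accepted, rejected = [], []
    for record in records:
        problems = validate_demographics(record, today)
        if problems:
            rejected.append((record, problems))
        else:
            accepted.append(record)
    return accepted, rejected
```

Only the accepted list would be forwarded; rejected records stay in the staging area with their problem list until they are corrected, which fits the iterative contribution described above.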

Centralized data processing facility 

The Informatics Center will aggregate each system's data into a 
patient-matched, de-duplicated central database and perform 
date shifting to preserve anonymity. This de-identified dataset 
will be available for query by investigators. 9 
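
One common way to implement date shifting is to move all of a patient's dates by the same pseudo-random offset, so that intervals between events are preserved while the true calendar dates are hidden. The sketch below assumes this approach; the keyed-hash derivation, the 365-day window, and the function names are illustrative choices and not a description of the Informatics Center's actual method.

```python
import hashlib
import hmac
from datetime import date, timedelta

MAX_SHIFT_DAYS = 365  # assumed window; the source does not specify one


def patient_offset(patient_id: str, secret: bytes) -> timedelta:
    """Derive a stable per-patient offset in [-MAX_SHIFT_DAYS, +MAX_SHIFT_DAYS].

    A keyed hash makes the offset reproducible for whoever holds the secret
    and hard to recover for anyone who does not.
    """
    digest = hmac.new(secret, patient_id.encode("utf-8"), hashlib.sha256).digest()
    span = 2 * MAX_SHIFT_DAYS + 1
    days = int.from_bytes(digest[:8], "big") % span - MAX_SHIFT_DAYS
    return timedelta(days=days)


def shift_dates(patient_id: str, dates, secret: bytes):
    """Shift every date for one patient by that patient's single offset."""
    offset = patient_offset(patient_id, secret)
    return [d + offset for d in dates]
```

For example, `shift_dates("p1", [date(2013, 1, 1), date(2013, 3, 1)], secret)` returns two dates that are still 59 days apart, so analyses based on time between encounters remain valid.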

It is critically important for a project like NYC-CDRN to 
match patients while preserving anonymity across multiple 
EHRs as a way of creating an integrated and complete view of 
longitudinal clinical data. 10 11 The Informatics Center will leverage 
two health information exchanges' existing electronic 
master patient indices, patient matching algorithms, and patient 
de-duplication techniques provided by vendors (table 1) to align 
data contributed by systems to NYC-CDRN. 
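
To make the idea of matching patients without exposing identifiers concrete, here is a deliberately simplified Python sketch. It is not the vendors' algorithm: production master patient indices typically use probabilistic matching over many fields, whereas this example assumes a deterministic key built from normalized name, birth date, and sex, hashed with a shared secret. All field names are hypothetical.

```python
import hashlib
import hmac
from collections import defaultdict


def match_key(record, secret: bytes) -> str:
    """Build a keyed hash of normalized identifiers (assumed fields)."""
    parts = (
        record["last_name"].strip().lower(),
        record["first_name"].strip().lower(),
        record["birth_date"],
        record["sex"].upper(),
    )
    message = "|".join(parts).encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def link_records(records, secret: bytes):
    """Group (site, local_id) pairs that share the same match key."""
    groups = defaultdict(list)
    for record in records:
        groups[match_key(record, secret)].append(
            (record["site"], record["local_id"])
        )
    return dict(groups)
```

Each resulting group stands for one presumed person seen at one or more sites; the central database would store the hashed key and the linked local identifiers rather than names or birth dates. Exact-key matching of this kind misses typographical variants, which is one reason established matching algorithms are preferable in practice.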

The central database will link to other sources including 
public and commercial claims data; patient-reported and 
patient-generated data, including data actively collected through 
surveys and passively collected through mobile devices; genomic 
data allowing for novel links to biologic and molecular disease 
markers; and other publicly available data. 12 

DISCUSSION 

The NYC-CDRN is an ambitious project that has the potential 
to significantly change the research landscape in New York City 
and help shape national research efforts through the national 
PCORnet. To ensure our best chance of success, we abide by 
several guiding principles. 

First, we strive to make the network truly patient-centered. 
We conduct all our activities in a fashion that is guided by, and 
accessible and understandable to, patients, caregivers, and their 
care teams. Patients have a wealth of knowledge about their 
conditions and healthcare experience that can inform and inspire 
new research opportunities. 

Second, NYC-CDRN depends on the active and successful 
collaboration of many different institutions and individuals. By 
nurturing that collaboration effectively, we will have access not 
only to a great wealth of existing expertise and resources within 
our participating institutions but also to new ideas and initiatives 
created by the interaction of those parties, such as innovative 
research protocols, patient engagement methods, and technical 
models. 

Third, the network needs to scale easily. As NYC-CDRN 
builds a network of health systems, we must continue to add 
new partners and link to the national PCORnet network. 
NYC-CDRN will draw strength from its scale. 

Fourth, we strive to not over-complicate an already complex 
job. We endeavor to choose simple, elegant solutions wherever 
possible. For example, we are employing an iterative process to 
develop our data model — starting with small sections of the 
template, building, testing, and improving before expanding the 
dataset and moving on to new sections. 

Finally, we strive to streamline the administrative and regulatory 
process to ensure that researchers can embark on critical 
research studies in a timely fashion, while ensuring the highest 
standards of patient safety and privacy. 



CONCLUSION 

With funding from PCORI, the NYC-CDRN is creating an accessible, 
sustainable, scalable clinical data network that will enable 
patient-centered research embedded within the functioning 
healthcare system, support a national research network, and facilitate the 
development of learning healthcare systems. The NYC-CDRN is 
well positioned to transform the research landscape in New York 
City and create new opportunities for wide-scale collaborations to 
design, conduct, and disseminate innovative clinical trials, CER, 
and population health management. 